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BACKGROUND OF THE INVENTION 

1 1. Technical Field 

2 The present invention pertains to systems facilitating secure network communications . 

3 In particular, the present invention pertains to an apparatus or system facilitating secure 

4 network communications for users accessing the network via voice responsive interfaces. 

5 2. Discussion of the Related Art 

6 Generally, computer systems are utilized to access and navigate through a 

7 communications network, such as the Internet. These computer systems each typically include 

8 an input device (e.g., keyboard, mouse, etc.) and network navigation software (e.g., abrowser) 

9 to traverse the network and communicate with various network sites. In order to prevent 
10., unauthorized access to information transmitted over the network, secure communication 

1 1 techniques may be employed that typically utilize certificates and private keys to verify a user 

1 2 identity and encrypt transferred information. The certificate is issued to a user by a certificate 

1 3 authority that basically ensures the identity of the particular user receiving the certificate. The 

14 certificate includes a public key and other identification information for a user and is stored 

1 5 along with a private key on a user computer system. When information is requested by the 

1 6 user computer system from a secure network site or web server system, a secure key exchange 

17 is negotiated between the user computer system and web server system typically utilizing a 

1 8 public-key/private-key scheme. Basically, certificates for the user computer system and web 

1 9 server system are initially transferred to provide the respective systems with the other ' s public 

20 key for performing the key exchange. The key exchange makes use of shared data plus public 

21 and private keys to allow both participants in the connection to generate a common secure 

22 session key. The key exchange results in a session key that is used to encrypt and decrypt 

23 subsequent information transferred between the user computer system and web server system 
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1 to provide a secure session. An exemplary computer protocol for this type of secure 

2 information transference is the Secure Sockets Layer (SSL) protocol. 

3 Further, network sites may employ security measures to verify authorized users and 

4 control access to the sites and/or site information. One such technique includes utilization of 

5 account or user names and corresponding passwords to control access to a network site. This 

6 technique may be utilized for each independent site, or a single user name and password may 

7 be utilized for multiple sites. For example, the Passport protocol, available from Microsoft 

8 Corporation, permits access to multiple web sites based on a single user name and password. 

9 The protocol includes a client computer system employing a browser, a merchant server and 

10 a protocol server. The protocol server maintains authentication and profile information for a 

1 1 client and provides the merchant server with access to this information when permitted by the 

1 2 client. In operation, a client, via the client computer system, accesses a merchant site requiring 

13 client authorization. The client system is redirected by the merchant server to the protocol 

14 server where the client provides the appropriate user name and password (e.g., "logs in" to 
1 5 " ~ the protocol server). This interaction utilizes the Secure Sockets Layer (SSL) protocol. The 

1 6 protocol server redirects the client system to the merchant site and provides the client system 

17 . with encrypted authentication information for that site. The authentication information is 

1 8 encrypted using a triple Data Encryption Standard (DES) technique having a key previously 

1 9 established between the merchant server and protocol server. The merchant server verifies the 

20 client based on the authentication information, and stores an encrypted file (e.g., cookie file) 

21 in the client system to enable authentication of the client by the merchant server for 

22 subsequent visits to that site (e.g., without repeating the protocol server login procedure). In 

23 addition, the protocol server similarly stores an encrypted file (e.g., cookie file) on the client 

24 system to enable authentication of the client by the protocol server for other sites (e.g., 

25 without repeating the login procedure). The Passport protocol further enables a client to 

26 provide personal and credit card information for selective transfer to multiple servers for 

27 purchasing products over the network. 

28 In addition, voice verification may be utilized in various systems and may be 

29 implemented by varying techniques to provide appropriate security. For example, co-pending 

30 U.S. Patent Application Serial No. 08/960,509, entitled "Voice Authentication System" and 

3 1 filed October 28, 1 997, discloses a speaker authentication system operable in first and second 
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1 modes. The first mode facilitates enrollment of users, while the second mode verifies that a 

2 person is a particular authorized user. The system includes a user interface and a verification 

3 module. The interface facilitates communication between a user and the verification module 

4 and operates in the first mode to prompt the user to utter a first set of phrases for enrolling the 

5 user. The user interface further prompts a user seeking verification in the second mode to utter 

6 a randomized second set of phrases corresponding to the first phrase set. The verification 

7 module generates voice models corresponding to the first set of speech utterances received 

8 from the user in the first mode and compares the voice models in the second mode to the 

9 randomized second set of speech utterances to verify that the user is a particular authorized 

10 user. The system may control remote computer access or access to information on network 

1 1 sites based on verification of user utterances. 

12 U.S. Patent No. 5,339,385 (Higgins) discloses a speaker verification system that 

1 3 accepts or rej ects the claimed identity of an individual based on an analysis of the individual ' s 

1 4 utterances. The individual is prompted to speak test phrases selected randomly and composed 

1 5 of words from a small vocabulary. The system determines nearest-neighbor distances b etween 

16 speech frames derived from the spoken test phrases and speech frames of corresponding 

17 vocabulary words from previously stored utterances of an enrolled speaker. In addition, 

1 8 - distances between the spoken test phrases and corresponding vocabulary words for a set of 

19 reference speakers are determined by the system. The claimed identification is accepted or 

20 " rejected based on the relationship of the determined distances to a predetermined threshold. 

21 U.S. Patent No. 5,414,755 (Bahler et al) discloses a method for passive' voice 

22 verification in a telephone network. A telephone long distance service is provided using 

23 speaker verification to determine the validity of a user. The user claims an identity by 

24 providing an identification, typically a calling card number, to a telephone. Unrestricted, 

25 extemporaneous speech of a group of customers are digitized, analyzed and characterized as 

26 a non-parametric set of speech feature vectors. The extemporaneous speech of the user is 

27 digitized and analyzed in a similar manner. The user identity is verified by comparing a 

28 reference utterance of a known customer with utterances from one or more unknown users, 

29 one of which is the user claiming the identity of a known customer. The comparison results 

30 in a decision to accept or reject the claimed identity, where the identity to be tested is derived 

3 1 from the calling card number. 
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1 U.S. Patent No. 5,806,040 (Vensko) discloses a speech controlled verification system 

2 for verifying the identity of a person using a telephone calling card, bank card or other credit 

3 card. The system connects the person to a telephone network to enter the card number. The 

4 card number is utilized to access a central database and retrieve a voice verification template 

5 corresponding to the entered card number. The system prompts the user to state one of the 

6 words, phrases and/or numbers contained in the retrieved voice verification template, and 

7 compares the stated words to the template. If the stated words match the template, the user is 

8 considered to be an authorized user and the card is validated. 

9 U.S. Patent No. 5,937,781 (Huang et al) discloses a voice verification system for 

10 telephone transactions. The system includes a mechanism to prompt the user to speak in a 

1 1 limited vocabulary, and a feature extractor that converts the limited vocabulary into a plurality 

1 2 of speech frames. A pre-processor is coupled to the feature extractor for processing the speech 

1 3 frames to produce a plurality of processed speech frames, while a frame label is assigned to 

1 4 each speech frame via a Viterbi decoder. The processed frames and frame labels are combined 

15 to produce a voice model that is compared to an authorized user voice model derived during 

1 6 a previous enrollment session. The user voice model is further compared with an alternative 

1 7 voice model set derived during previous enrollment sessions. The claimed identity is accepted 

1 8 when the user voice model more closely resembles the authorized user voice model than the 

1 9 alternative voice model set. 

20 Voice technology may further be employed by network browsers to interact with users 

2 1 and enhance accessibility to networks. With respect to the Internet, voice responsive browsers 

22 permit users to call an Internet Service Provider (ISP) via telephone and navigate the Internet 

23 by voice commands. Web pages retrieved by a voice responsive browser generally include 

24 extended definitions to enable the voice responsive browser to process those pages. The 

25 definitions provide the web page audio to synthesize for transmission to a caller and the 

26 appropriate speech to receive from the caller in response to a retrieved web page. Thus, the 

27 voice responsive browser basically provides audio to a caller to describe actions for a web 

28 page and performs commanded actions in response to appropriate voice commands from the 

29 caller with respect to that web page (e.g., relating to web page buttons or other selections). An 

30 example of a voice browser is disclosed in U.S. Patent No. 5,9 1 5,00 1 (Uppaluru). 
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1 The related art suffers from several disadvantages. In particular, when a telephone or 

2 similar device is utilized by a user to access a network via a voice responsive interface, the 

3 user does not have a computer system or memory for storing security information, such as a 

4 certificate and/or private key. This precludes use of the above-described techniques for secure 

5 network communications and restricts the network activities and navigational capabilities of 

6 the user. Although security information may be stored remotely, this exposes the security 

7 information to an increased risk of misappropriation, thereby allowing unauthorized users to 

8 improperly obtain security privileges to secure network sites and information. 

9 OBJECTS AND SUMMARY OF THE INVENTION 

10 Accordingly, it is an object of the present invention to facilitate secure network 

1 1 communications for network access via voice responsive network interface devices. 

12 It is another object of the present invention to facilitate secure network 

1 3 ~ communications with security information remotely stored from a user. 

1 4 Yet another object of the present invention is to control access to security information 

1 5 : that facilitates secure network communications by verifying the identity of a caller based on 

1 6 caller speech signals. 

17 The aforesaid objects are achieved individually and in combination, and it is not 

1 8 intended that the present invention be construed as requiring two or more of the objects to be 

1 9 combined unless expressly required by the claims attached hereto. 

20 According to the present invention, a system for facilitating secure network 

21 communications includes a security computer system and corresponding software. The 

22 security system is utilized in conjunction with a voice responsive network navigation tool or 

23 browser residing on a server system. A user accesses the network by placing a call to the voice 

24 browser system. The voice browser includes a software module in the form of a "plug-in" that 

25 creates a secure connection to the security computer system. The user provides an 

26 identification to the voice browser system in the form of speech or touch tone signals that is 

27 transferred to and verified by the security system. Once the identification is verified, the user 

28 is prompted by the voice browser system to speak a phrase for voice verification. The 

29 verification speech signals are transferred from the voice browser system to the security 

30 system to verify the transferred speech signals against speech signals of a particular authorized 

3 1 user associated with the identification and stored in a database. When the user is verified, the 
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1 security system retrieves a user private key and certificate from the database to provide secure 

2 network communications. In particular, when the user subsequently accesses a network or 

3 web site residing on a secure server and employing security measures, the secure server and 

4 voice browser system negotiate a secure key exchange as described above. During the key 

5 exchange process, data packets containing security information are transferred from the voice 

6 browser system to the security system for processing, while security information from the 

7 security system is transferred to the secure server via the voice browser system. The resulting 

8 session key is securely transferred from the security system to the voice browser system to 

9 facilitate secure communications between the voice browser system and secure server as 

10 described above. 

1 1 The abov e and still further objects, features and advantages of the present invention 

1 2 will become apparent upon consideration of the following detailed description of specific 

13 embodiments thereof, particularly when taken in conjunction with the accompanying 

14 drawings, wherein like reference numerals in the various figures are utilized to designate like 

1 5 components. 

16 - BRIEF DESCRIPTION OF THE DRAWINGS 

1 7 Fi §- 1 is a diagrammatic illustration of a voice interface security system according to 

18 the present invention coupled to a network and a voice responsive network access device to 

1 9 facilitate secure communications over that network for voice access. 

20 Fi S- 2 is a procedural flow chart illustrating the manner in which the security module 

2 1 of the voice interface security system of Fig. 1 facilitates verification of a caller according to 

22 the present invention. 

23 Fi §- 3 is a procedural flow chart illustrating the manner in which the security system 

24 of the voice interface security system of Fig. 1 verifies a caller according to the present 

25 invention. 

26 Fi §- 4 is a flow diagram illustrating the manner in which the voice interface security 

27 system of Fig. 1 verifies a caller based on caller voice signals according to the present 

28 invention. 

29 Fi §- 5 is a Procedural flow chart illustrating the manner in which the security module 

30 of the voice interface security system of Fig. 1 facilitates secure communications between the 

3 1 voice browser system and a secure network server according to the present invention. 
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1 Fig. 6 is a procedural flow chart illustrating the manner in which the security system 

2 of the voice interface security system of Fig. 1 facilitates secure communications between the 

3 voice browser system and a secure network server according to the present invention. 

4 Fig. 7 is a flow diagram illustrating the manner in which the voice interface security 

5 system of Fig. 1 facilitates secure communications between the voice browser system and a 

6 secure web server according to the present invention. 

7 DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

8 An exemplary network topology employing secure network communications for voice 

9 access in accordance with the present invention is illustrated in Fig. 1. Specifically, the 

1 0 topology includes a voice browser system 2, a security system 4, a network 1 0 (e.g., the World 

1 1 Wide Web or Internet), a secure web server 12 and an end-user system 14. The secure web 

12 server typically contains a desired network site and may be accessed by end-user system 14 

1 3 and/or voice browser system 2 via network 1 0. Voice browser system 2 is coupled to network 

14 10 and provides access to the network via voice commands received from a conventional 

1 5 telephone 1 6. The telephone may alternatively be implemented by any conventional or other 

16 communications device (e.g., cellular telephone, wired or land based telephone, wireless 

17 communication devices, computer system with audio input device, such as a microphone, 

18 etc.). Thus, a user may access the voice browser system and utilize voice commands to 

1 9 navigate and/ or interact with the network via any telephone or other communications device. 

20 Security system 4 is coupled to a database 8 and voice browser system 2 to facilitate secure 

2 1 communications with secure web server 1 2 over network 1 0 when a user accesses the network 

22 by telephone 16 as described below. 

23 End-user system 14 is coupled to network 10 and may access secure web server 12 

24 based on manual user entry of data (e.g., keyboard, mouse, etc.) or based on voice signals via 

25 voice browser system 2. The end-user system may be implemented by any conventional or 

26 commercially available personal or other type of computer system (e.g., IBM-compatible, 

27 Apple, Sun, desktop, laptop, PDA, etc.) preferably equipped with a display or monitor 18, a 

28 base 20 (i.e., including the processor, memories and internal or external communications 

29 devices (e.g., modem, network cards, etc.)), a keyboard 22 and optional input devices, such 

30 as mouse 24 and/or microphone 26. The end-user system includes software (e.g., operating 

31 system, Internet browser or other network navigation tool, etc.) to communicate with secure 
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1 server 12 and/or voice browser system 2, and appropriate components (e.g., processor, disk 

2 storage or hard drive, RAM, etc.) having sufficient processing and storage capabilities to 

3 effectively execute the software. The end-user system may utilize any of the maj or platforms 

4 (e.g., Linux, Macintosh, Unix, OS2, Windows, etc.). Similarly, secure server system 12 is 

5 typically implemented by a conventional personal or other suitable computer system 

6 preferably equipped with a display or monitor, a base (i . e. , including the processor, memories 

7 and internal or external communication devices (e.g., modem, network cards, etc.)), a 

8 keyboard and optional mouse or other input devices. The secure web server system includes 

9 software (e.g., operating system, server software, etc.) to communicate with end-user system 

10 14 and voice browser system 2, and appropriate components (e.g. , processor, disk storage or 

1 1 hard drive, RAM, etc.) having sufficient processing and storage capabilities to effectively 

12 execute the software. The server system may utilize any of the commercially available 

13 operating systems and/or server software. 

14" In operation, end-user system 1 4 typically stores a user certificate and private key to 

1 5 2 facilitate communications with a secure network site, such as sites residing on secure server 

16 12. In particular, when end-user system 14 requests a web page from secure server 12, a 

1 7 secure key exchange is negotiated between the end-user system and secure server by using the 
1 8 : private and public keys of the respective systems as described above. The user private key is 

1 9 stored on end-user system 1 4 and access is typically protected by a password or other security 

20 technique. The negotiation results in a session key that is utilized to encrypt and decrypt 

2 1 transmissions over the network for a session as described above. 

22 However, when network 10 is accessed via telephone 16, a user has no local storage 

23 mechanism to store the user private key and facilitate secure communications. Accordingly, 

24 the present invention facilitates secure communications over the network and access to secure 

25 network sites via telephone 16. Specifically, the present invention includes security system 

26 4 and a security module 6 that in combination provide the security measures to enable a user 

27 to access and facilitate secure communications with secure web server 12 via voice browser 

28 system 2 and telephone 16. Initially, the voice browser system may be implemented by any 

29 conventional or other suitable computer system preferably including a base having the 

30 processor, memories and internal or external communications devices (e.g., modem, network 

3 1 cards, telephony equipment to answer user calls, etc.)). The voice browser system may 
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1 optionally include a display or monitor, a keyboard, a mouse or other input or display devices. 

2 The voice browser system includes a voice responsive browser 3 and other software (e.g., 

3 operating system, server software, etc.) to communicate with a user via telephone 1 6 and with 

4 network 10, and appropriate components (e.g., processor, disk storage or hard drive, RAM, 

5 etc.) having sufficient processing and storage capabilities to effectively execute the software. 

6 The voice browser system may utilize any of the commercially available operating systems 

7 and basically functions as a server (e.g., for an Internet Service Provider (ISP)). The voice 

8 responsive browser basically facilitates navigation and interaction with network 10 in 

9 response to user voice commands and provides audio to the user in order to describe network 

10 sites to the user. The voice browser may be implemented using any conventional or 

1 1 commercially available products or techniques, such as those disclosed in U.S. Patent No. 

12 5,915,001 (Uppaluru), the disclosure of which is incorporated herein by reference in its 

13 entirety. A user accesses the voice browser system by placing a telephone call to the voice 

1 4 browser system via telephone 1 6 and utilizing voice commands to instruct the voice browser 

15 to navigate network 10. 

16 Security system 4 and security module 6 in combination facilitate communications 

1 7 between the voice browser system and secure web server 12. The security module is typically 

18 implemented in software in the form of a "plug-in" (e.g., software that extends the basic 

1 9 features of a software package and functions as though it is part of that package) for the voice 

20 responsive browser and resides on the voice browser computer system. The security module 

21 communicates with the security system to verify a caller identity and to receive negotiated 

22 session keys for communicating with secure server 12 as described below. The security 

23 system is typically implemented by a conventional personal or other suitable computer system 

24 preferably equipped with a display or monitor, a base (i.e., including the processor, memories 

25 and internal or external communication devices (e.g., modem, network cards, router, etc.)), 

26 a keyboard and optional mouse or other input devices. The security system includes software 

27 (e.g., operating system, communications software, database access software, security software, 

28 etc.) to communicate with voice browser system 2 and access database 8, and appropriate 

29 components (e.g., processor, disk storage or hard drive, RAM, etc.) having sufficient 

30 processing and storage capabilities to effectively execute the software. The security system 

3 1 may utilize any of the commercially available operating systems. 
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1 The security system and voice browser system may reside on the same local network, 

2 or alternatively may communicate with each other over network 10 via another socket or 

3 communication channel, or via any other suitable communications medium (e.g., LAN, WAN, 

4 private network, etc.). The security system is preferably implemented by an individual 

5 computer system, but may alternatively reside with the security module on the voice browser 

6 system. Database 8 may be implemented by any conventional or other database or storage 

7 system, and typically contains voice information and user certificates and private keys to 

8 respectively verify caller identities and negotiate a secure session as described below. 

9 The security module basically initiates user verification and identifies security relevant 

1 0 information for forwarding to the security system. The security module in combination with 

1 1 the security system provide users of voice browsers with secure access to their certificates and 

12 private keys stored in database 8. The security system negotiates the session key for use 

1 3 between the voice browser and a secure server web site. Thus, the present invention reduces 

1 4 risks associated with remote storage of private keys. The manner in which the security module 

1 5 _ performs verification functions is illustrated in Fig. 2. Initially, a caller places a telephone call 

16 to voice browser system 2 at step 30 via telephone 1 6. The voice browser executes the security 

17 module in response to the user connection or telephone call at step 32. Specifically, the 

1 8 security module establishes a secure connection (e.g., encrypted) to the security system at step 

19 34 via network 1 0 or other communications medium, and prompts the caller to speak or enter 

20 an identification (e.g., name, number, etc.) at step 36. The security module and security 

2 1 system may each include certificates to establish a secure session between them. The voice 

22 or touch tone signals provided by the caller representing the caller identification are retrieved 

23 and processed (e.g., digitized, etc.) by the voice browser system and subsequently forwarded 

24 to the security system for verification at step 38. If the caller identification is invalid as 

25 determined at step 40, appropriate action is taken at step 42 (e.g., log as security breach, enroll 

26 caller as new user, disconnect, etc.). 

27 W^ 11 me security system has determined that the caller identification is valid at step 

28 40, the security system via the security module initiates a voice verification session with the 

29 caller at step 44. In particular, the security module prompts the caller to speak a pass phrase, 

30 a random digit string or any combination of these two items. The spoken verification phrase 

31 is retrieved and processed (e.g., digitized, etc.) by the voice browser system and forwarded 
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1 to the security system for verification of the caller's identity at step 46. The verification results 

2 are forwarded to the voice browser system and are provided to the caller at step 48. If the 

3 caller's voice signals correspond with the voice information associated with the caller 

4 identification and stored in database 8 as determined at step 50, the user certificate and private 

5 key are retrieved by the security system from the database at step 54 to initiate 

6 communications with a secure site. When the caller is not verified as determined at step 50, 

7 and the quantity of attempts is less than a predetermined limit as determined at step 52, the 

8 caller is prompted to repeat the verification phrase at step 44. If the caller is not verified 

9 within the predetermined quantity of attempts as determined at step 52, access to the user 

1 0 security information (e.g., certificate and private key) is denied and/or other appropriate action 

11 is taken (e.g., the connection is terminated) at step 56. Once the caller is verified and the 

12 appropriate information is retrieved from database 8, the security system is in a state to 

1 3 facilitate secure communications for the voice browser system as described below. 

1 4 Tne manner in which the security system receives and processes information from the 

15 security module to verify a caller is illustrated in Fig. 3. Initially, the security system 

16 establishes a secure connection with the voice browser system at step 60, and receives the 

17 spoken or entered identification from the security module at step 62. The identification is 

1 8 verified by the security system at step 64 by comparing the spoken or touch tone identification 

1 9 to valid identifications stored in database 8 . If the identification does not reside in the database 

20 as determined at step 66, the security system may record the event as a possible security 

2 1 violation at step 68 and attempt to enroll the caller as a new user or perform other appropriate 

22 actions at step 70. After such actions are completed, the caller may re-attempt access or have 

23 the connection terminated. 

24 When ^ identification is valid as determined at step 66, voice information 

25 corresponding to the valid identification is retrieved by the security system from the database 

26 at step 72 in order to verify the caller identity. Initially, authorized users are enrolled with the 

27 security system and assigned an identifier (e.g., a user name or identification). During the 

28 enrollment process, each user is prompted to speak particular vocabulary words. The spoken 

29 words for each user are verified against the user prompts to ensure validity of the spoken 

30 words, and are subsequently processed for storage in database 8 based on the corresponding 

3 1 user identification. The stored words of enrolled users are later retrieved based on the user 
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1 identification provided by a caller for comparison with spoken words of the caller to verify 

2 the caller identity. The voice browser and security systems may utilize various techniques to 

3 process voice signals received from users, enroll users and verify users, such as those 

4 disclosed in U.S. Patent Nos. 5,339,385 (Higgins) and 5,937,381 (Huang et al) and co- 

5 pending U.S. Patent Application Serial No. 08/960,509, entitled "Voice Authentication 

6 System" and filed October 29, 1997. The aforementioned patents and patent application are 

7 incorporated herein by reference in their entireties. 

8 The security system retrieves vocabulary words within the enrollment data at step 74 

9 for forwarding to the voice browser system. The voice browser system prompts the caller to 

1 0 speak the selected words, and the spoken words are received, processed (e.g., digitized, etc.) 

1 1 and forwarded by the voice browser system to the security system via a secure connection 

12 (e.g., encrypted) at step 76. The spoken words are verified against the enrollment voice 

1 3 information retrieved from the database and associated with the provided user identification 

14 at step 78. This may be accomplished utilizing various techniques, such as those disclosed in 

1 5 the above-mentioned Higgins (5,339,385) and Huang et al patents (5,937,38 1) andU.S. Patent 

16 Application Serial No. 08/960,509. The resulting determination is forwarded to the voice 

17 browser system at step 80 to inform the caller of the verification results. If the caller is 

1 8 verified as determined at step 82, the user certificate and private key are retrieved from the 

19 database at step 84 to facilitate communications with a secure network site. Otherwise, the 

20 caller may re-attempt access or have the connection terminated. 

2 1 Operation of the system and overall interaction between a caller, voice browser system 

22 and security system to verify the caller is diagrammatically illustrated in Fig. 4. Specifically, 

23 the caller initially places a telephone call to the voice browser system at flow 90 and the 

24 security module is initiated in response to the call at flow 92. The security module 

25 subsequently establishes a secure connection with the security system at flow 94, and prompts 

26 the caller to speak or enter a user identification at flow 96. The caller provides the user 

27 identification at flow 98, and the security module requests voice verification based on the user 

28 identification from the security system at flow 100. The security system validates the 

29 identification and retrieves a user profile associated with the identification and including voice 
3 0 information at flow 1 02 . An identifying phrase is selected and the security system requests the 
3 1 security module at flow 1 04 to prompt the caller to speak the phrase at flow 1 06. The caller 
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1 responds to the prompt at flow 1 08 and the spoken phrase is forwarded to the security system 

2 at flow 1 1 0 for verification at flow 1 1 2. The resulting verification determination is forwarded 

3 through the security module at flow 1 14 to the caller at flow 116. If the caller identity is 

4 verified, the user private key and certificate are obtained from the database at flow 118. 

5 Once a caller has been verified and the corresponding private key and certificate 

6 retrieved from the database, the security system is enabled to negotiate sessions with a secure 

7 network site. The security module in the voice browser system basically intercepts security 

8 relevant information received from a secure network site and forwards that information from 

9 the voice browser system to the security system in order to enable the security system to 

1 0 negotiate the session key with the secure network site. Once the session key is negotiated by 

1 1 the security system and provided to the voice browser system, the voice browser system and 

1 2 secure web server communicate in an encrypted fashion with the session key in substantially 
13/ the same manner described above for end-user system 14. 

14 Initially, the parameters of a session are determined by a handshake protocol. 

15 ~i Basically, the handshake protocol enables the security system and secure web server to 

16 " negotiate a protocol version, select cryptographic algorithms, and authenticate each other, 

17 while public key encryption techniques are utilized to transfer confidential information. By 

18 way of example only, the handshake protocol employed by the present invention is the 

19 handshake protocol utilized by the Secure Sockets Layer (SSL) protocol. However, any 

20 suitable handshake scheme or protocol may be utilized. Specifically, when a user navigates 

2 1 ~- to a secure site, the voice browser system initially sends to the secure web server a client hello 

22 message. The secure web server must respond with a server hello message to prevent a fatal 

23 error and connection failure. These hello messages are basically utilized to establish security 

24 enhancement capabilities between the security system and secure web server and various 

25 attributes (e.g., protocol version, session identification (ID), cipher suite (e.g., the 

26 cryptographic techniques utilized by the security system and secure web server, typically 

27 including a cipher spec (described below) and key exchange algorithm), compression method, 

28 etc.). Subsequent transmission of the client and server hello messages, the secure web server 

29 transmits to the voice browser system its certificate for authentication, a server key exchange 

30 message, a request for a certificate and a hello done message (e.g., indicating completion of 

31 the initial phase of the handshake protocol). The security system receives this information 
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1 from the voice browser system and responds with the certificate and a key exchange message, 

2 the format of which is dependent upon the public key algorithm selected between the security 

3 system and secure web server. Moreover, the security system transmits a change cipher spec 

4 message (e.g., having a protocol specific format) to the secure web server via the voice 

5 browser system and copies a pending cipher spec into a current cipher spec. The cipher spec 

6 basically contains information describing the manner of encryption for the session, while the 

7 change cipher spec message indicates that subsequent information will be transmitted in 

8 accordance with the negotiated cipher spec and keys. The security system sends, via the voice 

9 browser system, a finished message (e.g., indicating completion of the handshake) to the 

1 0 secure web server encrypted in the negotiated manner (e.g., in accordance with the new cipher 

1 1 spec). The secure web server similarly sends a change cipher spec message to the security 

1 2 system via the voice browser system, transfers the pending cipher spec to the current cipher 

1 3 spec and transmits a finished message encrypted in the negotiated manner (e.g. , in accordance 

14 with the new cipher spec). Upon receipt of the respective finished messages, the security 

1 5 system forwards the negotiated information to the voice browser system to enable the voice 

16 browser system and secure web server to transmit data in accordance with the negotiated 

1 7 cipher spec information. 

18 When the voice browser system and secure web server intend to resume a prior 

1 9 session, the above process may be abbreviated. Generally, the voice browser system transmits 

20 a client hello message containing the session ID of the session to be resumed. The secure web 

21 server verifies the session ID and, if the secure web server desires to resume that session, 

22 transmits a server hello message to the security system via the voice browser system with that 

23 session ID value. The security system and secure web server transmit change cipher spec 

24 messages and subsequently send their respective finished messages as described above. The 

25 negotiated information is forwarded from the security system to the voice browser system to 

26 enable transfer of information between the voice browser system and secure web server in the 

27 encrypted manner negotiated for that session as described above. If the secure web server does 

28 not verify the session ID, or does not desire to resume the session, the handshake protocol 

29 described above is repeated to establish a new session. 

30 The security module intercepts security relevant messages received by the voice 

31 browser system from the secure web server and forwards them to the security system for 
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1 processing as illustrated in Fig. 5. Initially, a user navigates to the secure web server at step 

2 120 after voice verification as described above. The voice browser system initiates a 

3 handshake with the secure web server in substantially the same manner described above. The 

4 voice browser system receives security messages from the secure web server and the security 

5 module detects these messages and forwards them to the security system for processing at step 

6 122. The security system processes the messages and transmits reply messages to the voice 

7 browser system for forwarding to the secure web server at step 1 24. Specifically, the security 

8 system accesses the user private key and certificate and provides the appropriate information 

9 to the voice browser system for transmission to the secure web server as described above. The 

1 0 security system further processes the messages to determine the negotiated key and forwards 

11 that key to the voice browser system at step 126 to enable the voice browser system to 

12 communicate with the secure web server via the negotiated key in the manner described above 

1 3 for end-user system 14. 

1 4 Tft e manner in which the security system negotiates session parameters with the secure 

1 5 web server is illustrated in Fig. 6. Initially, the security system receives security messages 

16 from the security module at step 130 in response to the user navigating to the secure web 
17. server and enabling initiation of the handshake protocol as described above. The security 

1 8 system processes the messages and retrieves the user private key and certificate at step 1 32 

19 to generate responses and provide information (e.g., certificates, key exchange data, etc.) 

20 requested in the messages to the voice browser system for forwarding to the secure web server 

2 1 as described above. The appropriate messages are generated at step 1 34 and sent to the voice 

22 browser system for forwarding to the secure web server. When a negotiated session key is 

23 determined from the messages by the security system, the key is forwarded to the voice 

24 browser system at step 136 to enable the voice browser system to communicate with the 

25 secure web server in an encrypted manner via the negotiated key in substantially the same 

26 manner described above for the end-user system. 

27 The operation and overall interaction between the user, voice browser system, security 

28 system and secure web server are diagrammatically illustrated in Fig. 7. Initially, a user, 

29 subsequent to voice verification, navigates via the voice browser to a site on secure server 1 2 

30 as indicated at flow 140. The voice browser sends a client hello message to the secure server 

31 at flow 142. The secure server responds by transmitting to the voice browser at flow 144 a 



15 



1 server hello message, server certificate, server key exchange message, a certificate request 

2 and a hello done message as described above. The server messages are received by the voice 

3 browser and forwarded to the security system via the security module as indicated by flows 

4 146, 148. The security system accesses the user private key and certificate and generates 

5 handshake messages (e.g., client certificate, key exchange message, certificate verify message, 

6 etc.) in response to the server messages at flow 150. The response messages are forwarded to 

7 the voice browser via the security module as indicated at flows 152,153 and are transmitted 

8 by the voice browser to the secure server at flow 1 54. The resulting negotiated session key is 

9 forwarded from the security system to the voice browser via the security module at flows 1 56, 

10 158. The voice browser may subsequently provide the user with access to the secure site via 

1 1 the session key as indicated at flow 160, while information is transmitted between the voice 

1 2 browser and secure site in an encrypted manner via the negotiated key at flows 1 62, 1 63 . The 

13 encryption and decryption may be performed utilizing any conventional or other 

14 encryption/decryption techniques. 

1 5 When the voice browser and secure server intend to resume a prior session, the above 

1 6 process may be reduced by utilizing the session ID. In particular, the voice browser transmits 

1 7 a client hello message containing the session ID of the session to be resumed. The secure 

1 8 server verifies the session ID, and if the secure server desires to resume that session, transmits 

1 9 a server hello message to the security system via the voice browser system with that session 

20 ID value. The security system and secure server transmit change cipher spec messages and 

21 subsequently send their respective finished messages as described above. The negotiated 

22 information is forwarded from the security system to the voice browser to enable transfer of 

23 information between the voice browser and secure server in the encrypted manner negotiated 

24 for that session as described above. If the secure server does not verify the session ID, or does 

25 not want to resume the session, the handshake protocol described above is repeated to 

26 establish a new session. 

27 It is to be understood that the voice browser system may receive voice commands 

28 through various devices and techniques, while the system operates as described above to 

29 facilitate secure communications. For example, voice commands may be received over 

30 network 10 from various sources (e.g., computer systems, telephone or communications 

3 1 systems, etc.). The voice browser system processes the voice signals and verifies the user via 
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1 the security system in substantially the same manner described above. Thus, end-user system 

2 1 4 or other system employing a microphone or other audio input device may enable a user to 

3 navigate a network in accordance with voice commands. The voice signals are processed and 

4 transmitted to the voice browser system via network 1 0. The voice browser system interacts 

5 with the user as described above and may facilitate communications to secure network sites. 

6 The software for the security module and security system is preferably developed in 

7 the 'C++' programming language, but may be implemented in any suitable computer 

8 language. It is to be understood that one of ordinary skill in the computer arts could develop 

9 the software of the security module and security system based on the functional descriptions 

1 0 contained herein and the flow charts and diagrams illustrated in the drawings. The present 

11 invention is not limited to the protocols described herein, but may utilize any protocols to 

12 establish a session with a server. 
It will be appreciated that the embodiments described above and illustrated in the 

14 drawings represent only a few of the many ways of implementing a method and apparatus to 

1 5 " facilitate secure network communications with a voice responsive network interface device. 
The computer systems of the voice browser, end-user, secure web server and security 

17 systems may be implemented by any quantity of any personal or other type of computer 

system (e.g., IBM-compatible, Apple, Macintosh, laptop, palm pilot, etc.). These computer 
systems may include any commercially available operating system (e.g., Windows, OS/2, 
Unix, Linux, etc.), any commercially available or custom software (e.g., server software, 

2 1 browser software, voice processing software, security module, security system software, etc.) 

22 and any types of input devices (e.g., keyboard, mouse, microphone, etc.). It is to be 

23 understood that the software of the security system and security module may be implemented 

24 in any desired computer language. The computer systems may alternatively be implemented 

25 by hardware or other processing circuitry. The various functions of the voice browser, security 

26 module, security system and database may be distributed in any manner among any quantity 

27 (e.g., one or more) of modules, computer or processing systems or circuitry where the 
2 8 computer systems may be disposed locally or remotely of each other and communicate via any 

29 suitable communications medium (e.g., LAN, WAN, Intranet, Internet, hardwire, modem 

30 connection, wireless, etc.). The software and/or algorithms described above and illustrated 
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1 in the flow charts and diagrams may be modified in any manner that accomplishes the 

2 functions described herein. 

3 The network may be implemented by any communications network or medium (e.g., 

4 LAN, WAN, Internet, Intranet, direct connection, modem connection, wireless, etc.). The 

5 voice browser and security systems may include any conventional or other communications 

6 devices, and may communicate over the network or any other communications medium (e.g., 

7 LAN, WAN, Intranet, Internet, hardwire, modem connection, wireless, etc.). The voice 

8 browser system may be accessed by any suitable communications device (e.g., cellular 

9 telephone, wired or land based telephone, wireless communication devices, computer system 

10 with audio input device, such as a microphone, etc.). 

1 1 The voice browser system may accommodate any quantity of users, and include any 

1 2 conventional or other web server, voice processing and voice browser software. The plug-in 

13 security module may alternatively be implemented as a separate stand-alone program or 

14 software module, or be included with the security system and/or voice browser software. 

1 5 Further, the secure web server may accommodate any type of web page or form, and provide 

1 6 any type of user interface to the end-user or voice browser systems. The user interface may 

1 7 provide or obtain any desired information from the user. 

18 The security system may access the database via any suitable communications 

19 medium, devices, query language or protocols. The database may be implemented by any 

20 quantity of conventional or other databases or storage structures (e.g., file, data structure, etc.), 

21 maybe arranged in any fashion and may store any desired information based on any identifiers 

22 or keys. The database may reside on any quantity of (e.g., one or more) computer or 

23 processing systems separate from the security system and disposed locally or remotely of each 

24 other and the security system. The database computer systems may communicate with each 

25 other and the security system via any suitable communications medium, devices and protocols 

26 (e.g., LAN, WAN, Intranet, Internet, hardwire, modem connection, wireless, etc.). The 

27 database may contain any desired information, and may reside on the voice browser and/or 

28 security systems. 

29 The voice browser, security module and security system software may be available 

30 individually or in any combination on arecorded medium (e.g., magnetic or optical mediums, 

3 1 magneto-optic mediums, floppy diskettes, CD-ROM, memory devices, etc.) for use on stand- 



18 



1 alone systems or systems connected by a network or other communications medium, and/or 

2 may be downloaded (e.g., in the form of carrier waves, packets, etc.) individually or in any 

3 combination to systems via a network or other communications medium. 

4 The database or other storage device may contain any quantity of any type of security 

5 information to facilitate secure network communications (e.g. , certificates, keys, parameters, 

6 etc.). The user identification may include any quantity of any types of alphanumeric or other 

7 characters or symbols, and may be entered by a user or caller in any manner (e.g., voice 

8 signals, telephone or other keypad, etc.). The voice information may include any quantity of 

9 words from any vocabulary, or any speech modeling data (e.g., speech sounds) to verify users. 

1 0 The voice information may be processed in any desired manner for storage in the database. 

11 The present invention may utilize any types of conventional or other protocols to negotiate 

1 2 secure session parameters and transfer information in a secure manner. These protocols may 

13 • utilize any conventional or other encryption and decryption techniques and any quantity of 

14 parameters (e.g., any quantity of keys, certificates, identifications, etc.). The caller voice 

15 signals may be processed in any manner, and any conventional or other techniques may be 

16 utilized to compare the caller voice signals to the stored voice information. The security 

1 7 system may request the caller to speak any quantity of any type of phonetic or speech sounds, 

1 8 words or phrases to verify the caller. The enrollment process and stored voice information 

1 9 identifying the caller may utilize any quantity of any type of phonetic or speech sounds or 

20 words or phrases that may be predetermined or selected by a user. 

2 1 The voice browser and security systems may take any appropriate actions in response 

22 to invalid identifications or a caller not being verified (e.g., prompt caller to re-enter 

23 information, terminate connection, enroll caller, log event, notify caller of error or invalidity, 

24 etc.). Further, these actions may be initiated after any quantity of user attempts to enter valid 

25 information and/or voice or speech signals. The present invention is not limited to the 

26 specific applications disclosed herein, but may be utilized in substantially the same manner 

27 described above to control access to information (e.g., passwords, URLs, computer addresses, 

28 street addresses, social security numbers, security information, encryption keys, etc.) and/or 

29 network sites. For example, the present invention may control access to passwords, URLs or 

30 computer addresses, to initiate an automatic login sequence or transference to the URL or 

31 address in response to voice verification. The functions of the voice browser, security 
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1 module, security system and database may be combined, separated and/or distributed in any 

2 manner among any quantity of software modules and/or computer systems. 

3 From the foregoing description it will be appreciated that the invention makes 

4 available a novel method and apparatus to facilitate secure network communications with a 

5 voice responsive network interface device wherein a security system facilitates secure network 

6 communications to a secure network site when a user accesses the network via a voice 

7 responsive network interface device. 

8 Having described preferred embodiments of a new and improved method and 

9 apparatus to facilitate secure network communications with a voice responsive network 

10 interface device, it is believed that other modifications, variations and changes will be 

1 1 suggested to those skilled in the art in view of the teachings set forth herein. It is therefore 

12 to be understood that all such variations, modifications and changes are believed to fall within 

13 the scope of the present invention as defined by the appended claims. 
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